The maximum common substructure (MCS) problem is 
of great importance in multiple aspects of chemoinfor- 
matics. It has diverse applications ranging from lead pre- 
diction to automated reaction mapping and visual 
alignment of similar compounds. Many different algo- 
rithms have been developed [1], both exact and approxi- 
mate. Since the MCS problem is NP-complete, the strict 
time constraints of most applications can only be realisti- 
cally satisfied by fast and robust approximation methods. 

We developed two efficient heuristic algorithms. One 
is based on the popular approach of reducing the MCS 
problem to finding the maximum clique in the modular 
product of the two molecule graphs. The other is based 
on a new algorithm by Kawabata, called the build-up 
method [2]. We also incorporated other techniques, for 
example, the topological fingerprinting primarily used in 
substructure and similarity searching. We optimized our 
implementations for use in multiple applications devel- 
oped at ChemAxon. In some applications, for example, 
hierarchical MCS-based clustering or similarity search in 
large databases, the algorithms are required to give close 
to optimal results in limited time. To meet these con- 
flicting demands, our implementations were enhanced 
with strong heuristics. Upper bound calculation meth- 
ods were also applied for screening and early termina- 
tion purposes. 
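
The clique-based reduction and the upper-bound screening mentioned above can be illustrated with a minimal sketch. It is not the implementation described here: the molecule representation, the function names, the greedy selection rule, and the label-counting bound are all assumptions chosen for brevity, and bond types are ignored. Note that a clique in the modular product corresponds to a common induced subgraph that may be disconnected.

```python
from collections import Counter
from itertools import combinations

def make_mol(labels, bonds):
    """Hypothetical molecule graph: atom labels plus unordered bond pairs."""
    return {"labels": list(labels), "bonds": {frozenset(b) for b in bonds}}

def modular_product(m1, m2):
    """Build the modular product of two molecule graphs.

    Nodes are pairs (i, j) of atoms with equal labels. Two nodes are
    adjacent when they use distinct atoms in both molecules and the two
    atom pairs are either both bonded or both non-bonded.
    """
    nodes = [(i, j)
             for i, a in enumerate(m1["labels"])
             for j, b in enumerate(m2["labels"]) if a == b]
    adj = {n: set() for n in nodes}
    for (i1, j1), (i2, j2) in combinations(nodes, 2):
        if i1 == i2 or j1 == j2:
            continue  # each atom may be matched at most once
        bonded1 = frozenset((i1, i2)) in m1["bonds"]
        bonded2 = frozenset((j1, j2)) in m2["bonds"]
        if bonded1 == bonded2:
            adj[(i1, j1)].add((i2, j2))
            adj[(i2, j2)].add((i1, j1))
    return nodes, adj

def greedy_clique(nodes, adj):
    """Greedy heuristic (assumed, not exact): repeatedly pick the candidate
    with the most neighbours among the remaining candidates."""
    clique, candidates = [], set(nodes)
    while candidates:
        best = max(sorted(candidates), key=lambda n: len(adj[n] & candidates))
        clique.append(best)
        candidates &= adj[best]  # keep only nodes adjacent to every pick
    return clique

def atom_upper_bound(m1, m2):
    """Simple upper bound on the MCS size: shared atom-label counts."""
    c1, c2 = Counter(m1["labels"]), Counter(m2["labels"])
    return sum((c1 & c2).values())

# Usage: ethanol (C-C-O) versus dimethyl ether (C-O-C), hydrogens omitted.
ethanol = make_mol("CCO", [(0, 1), (1, 2)])
ether = make_mol("COC", [(0, 1), (1, 2)])
nodes, adj = modular_product(ethanol, ether)
mapping = greedy_clique(nodes, adj)
print(mapping, atom_upper_bound(ethanol, ether))
```

The upper bound can be computed before building the product graph, so a pair of molecules can be screened out early when the bound already falls below a required size; the search can likewise stop as soon as a clique reaches the bound.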

In other applications, for example, reaction mapping or 
visual alignment, the challenge is that topological features 
must also be taken into account. Besides the size of the
common substructure found, the one-to-one correspondence
it determines between the atoms of the molecules
also matters. Effective heuristics were developed
to guide the algorithms to prefer those solutions in which 
the relative positions of the common fragments of the 
input molecules are as similar as possible. 

Our implementations have been thoroughly tested and 
benchmarked. They have also been compared to publicly 
available solution methods, and integrated into different 
products at ChemAxon. This has shown that the presented 
MCS algorithms can adequately cover the conflicting 
requirements of typical applications. We present our 
methods and heuristics along with their effects on running 
time, memory usage, and the size and features of the 
result. 